Objective Weight Interval Estimation Using Adversarial Inverse Reinforcement Learning

Abstract

Several real-world problems are modeled as multi-objective sequential decision-making problems with multiple competing objectives, and multi-objective reinforcement learning (MORL) has garnered attention as a solution to this problem. One of the challenges in obtaining the desired policy using MORL is that the priorities (hereafter, weights) for each objective must be designed in advance to scalarize the reward vector. Determining the weights through trial-and-error burdens system designers, so methods to estimate the weights are needed. The existing methods use inverse reinforcement learning (IRL), which is not scalable because it requires repeating MORL several times until an optimal weight is obtained. This study proposes the objective weight interval estimation (WInter) method using adversarial IRL (AIRL). The AIRL framework reduces computational complexity by simultaneously estimating rewards and policies. WInter estimates the expert weights as intervals using the neighborhoods of the expert policy obtained during training. We successfully estimated the expert weights in experiments using a benchmark environment with a continuous state space while reducing the computational complexity compared with the existing methods.
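
As a rough illustration of what "scalarizing the reward vector" with weights means, the sketch below assumes a linear weighted sum with non-negative weights that sum to 1. The abstract does not state which scalarization the paper uses, so this form, the function name `scalarize`, and the two-objective example values are all assumptions made for illustration only.

```python
import math


def scalarize(reward_vec, weights):
    """Collapse a multi-objective reward vector into one scalar.

    Assumption: linear weighted-sum scalarization; not confirmed by the abstract.
    """
    if len(reward_vec) != len(weights):
        raise ValueError("reward vector and weights must have the same length")
    if any(w < 0 for w in weights) or not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * r for w, r in zip(weights, reward_vec))


# Hypothetical two-objective reward, scored under two different priorities.
reward = [1.0, -0.5]
first_objective_favored = scalarize(reward, [0.8, 0.2])
second_objective_favored = scalarize(reward, [0.2, 0.8])
```

The same reward vector yields a different scalar, and therefore a different preferred policy, under each weight choice, which is why the weights have to be fixed before training.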

Similar resources

Learning Robust Rewards with Adversarial Inverse Reinforcement Learning

Reinforcement learning provides a powerful and general framework for decision making and control, but its application in practice is often hindered by the need for extensive feature and reward engineering. Deep reinforcement learning methods can remove the need for explicit engineering of policy or value features, but still require a manually specified reward function. Inverse reinforcement lea...

OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning

Reinforcement learning has shown promise in learning policies that can solve complex problems. However, manually specifying a good reward function can be difficult, especially for intricate tasks. Inverse reinforcement learning offers a useful paradigm to learn the underlying reward function directly from expert demonstrations. Yet in reality, the corpus of demonstrations may contain trajectori...

Generalizing Adversarial Reinforcement Learning

Reinforcement Learning has been used for a number of years in single agent environments. This article reports on our investigation of Reinforcement Learning techniques in a multi-agent and adversarial environment with continuous observable state information. Our framework for evaluating algorithms is two-player hexagonal grid soccer. We introduce an extension to Prioritized Sweeping that allows...

Adversarial Reinforcement Learning

Reinforcement Learning has been used for a number of years in single agent environments. This article reports on our investigation of Reinforcement Learning techniques in a multi-agent and adversarial environment with continuous observable state information. We introduce a new framework, two-player hexagonal grid soccer, in which to evaluate algorithms. We then compare the performance of severa...

Robust Adversarial Reinforcement Learning

Deep neural networks coupled with fast simulation and improved computation have led to recent successes in the field of reinforcement learning (RL). However, most current RL-based approaches fail to generalize since: (a) the gap between simulation and real world is so large that policy-learning approaches fail to transfer; (b) even if policy learning is done in real world, the data scarcity lea...

Journal

Journal title: IEEE Access

Year: 2023

ISSN: 2169-3536

DOI: https://doi.org/10.1109/access.2023.3281593